Parsers in TeX and using CWEB for general pretty-printing
Abstract
The need to process formally structured languages inside TeX documents is neither new nor uncommon. Several graphics extensions for TeX (and LaTeX) have introduced a variety of small specialized languages that depend on simple (and not so simple) interpreters coded as TeX macros. A number of pretty-printing macros take advantage of different parsing techniques to achieve their goals (see [Go], [Do], and [Wo]).

Efforts to create general and robust parsing frameworks inside TeX go back to the origins of TeX itself. A well-known BASIC subset interpreter, BASIX (see [Gr]), was written as a demonstration of the flexibility of TeX as a programming language and a showcase of TeX’s ability to handle a variety of abstract data structures. On the other hand, a relatively recent addition to the LaTeX toolbox, l3regex (see [La]), provides a powerful and very general way to perform regular expression matching in LaTeX, which can be used (among other things) to design parsers and scanners.

Paper [Go] contains a very good overview of several approaches to parsing and tokenizing in TeX and outlines a universal framework for parser design using TeX macros. In an earlier article (see [Wo]), Marcin Woliński describes a parser creation suite paralleling the technique used by CWEB (CWEB’s ‘grammar’ is hard-coded into CWEAVE, whereas Woliński’s approach is more general). One commonality between these two methods is a highly customized tokenizer (or scanner) used as the input to the parser proper. Woliński’s design uses a finite automaton as the scanner engine with a ‘manually’ designed set of states. No backing up mechanism was provided, so matching, say, the longest input would require some custom coding (it is, perhaps, worth mentioning here that a backup mechanism is all one needs to turn any regular language scanner into a general CWEB-type parser). The scanner in [Go] was designed mainly with efficiency in mind ...

TUGboat, Volume 35 (2014), No. 1, p. 71
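The abstract's remark about longest-match scanning and backing up can be illustrated with a small sketch. It is written in Python rather than TeX macros and is not the scanner from the paper, from [Wo], or from [Go]. The token kinds (INT, REAL, DOT, IDENT), the state numbers, and the function names are assumptions made up for this example. The automaton records the last accepting position it passed, and when it gets stuck it backs up to that position instead of failing.

```python
# Hypothetical hand-designed automaton; states and token kinds are
# illustrative assumptions, not taken from the paper.
START = 0
TRANSITIONS = {
    (0, "digit"): 1, (0, "dot"): 4, (0, "letter"): 5,
    (1, "digit"): 1, (1, "dot"): 2,   # state 2: digits followed by "."
    (2, "digit"): 3,
    (3, "digit"): 3,
    (5, "letter"): 5,
}
ACCEPTING = {1: "INT", 3: "REAL", 4: "DOT", 5: "IDENT"}


def char_class(ch):
    """Map a character to the input class used by the automaton."""
    if "0" <= ch <= "9":
        return "digit"
    if ch == ".":
        return "dot"
    if ch.isalpha():
        return "letter"
    return "other"


def scan(text):
    """Return (kind, lexeme) pairs using longest match with backup."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():      # skip whitespace between tokens
            pos += 1
            continue
        state = START
        last_end, last_kind = None, None
        i = pos
        while i < len(text):
            state = TRANSITIONS.get((state, char_class(text[i])))
            if state is None:        # automaton is stuck
                break
            i += 1
            if state in ACCEPTING:   # remember the longest match so far
                last_end, last_kind = i, ACCEPTING[state]
        if last_end is None:
            raise ValueError(f"no token at position {pos}")
        tokens.append((last_kind, text[pos:last_end]))
        pos = last_end               # back up to the last accepting position
    return tokens
```

On the input `12.x` the automaton reads `12.` hoping for a REAL, gets stuck on `x`, and backs up to emit INT `12` before rescanning from the dot. Without the saved accepting position, the scanner would have to reject the input or rely on custom lookahead code.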
Similar resources
Analysis of Literate Programs from the Viewpoint of Reuse
Donald Knuth created the WEB system for literate programming when he wrote the second version of TeX, a book-quality formatting system. Levy later created CWEB, which is based on Knuth’s WEB but uses the C programming language and supports development in C and C++. Krommes’ FWEB is based on CWEB and supports several programming languages. We analyze some parts of the...
Functional Pearl: Replaying the stack for parsing and pretty printing
Modulo inessential details, parsers and pretty printers, to and from algebraic datatypes, offer an uncanny resemblance and yet are all too often defined separately, in gross violation of the “don’t repeat yourself” principle. We present a family of reversible parser/printer combinators that allows one to define both at once in a type-safe manner, compositionally and without any need for a prepr...
Adding Native Language Support to the CWEB package and the TeX program
To add National Language Support (NLS, for short) to literate programs, I propose making changes to their text via change files, so that the modified programs are aware of and able to support multiple languages. This paper describes how the GNU libc and gettext libraries were used to add NLS to the CWEB package and presents a possible way of bringing NLS to the TeX program. have one CWEB that...
Pascal pretty-printing: an example of "preprocessing with TeX"
Pretty-printing a piece of Pascal code with TeX is often done via an external preprocessor. Actually, the job can be done entirely in TeX; this paper introduces PPP, a Pascal pretty-printer environment that allows you to typeset Pascal code by simply typing \Pascal ⟨Pascal code⟩ \endpascal. The same approach of "preprocessing with TeX", namely two-token tail-recursion around a \FIND-like macr...
Featherweight TeX and Parser Correctness
TeX (and its LaTeX incarnation) is a widely used document preparation system for technical and scientific documents. At the same time, TeX is also an unusual programming language with a quite powerful macro system. Despite the wide range of TeX users (especially in the scientific community), and despite a widely perceived considerable level of “pain” in using TeX, there is almost no research on ...
Publication date: 2014